Abstract

In this paper we propose a simple and efficient data structure yielding a perfect
hashing of quite general arrays. The data structure is named phorma, which is an
acronym for perfectly hashable order restricted multidimensional array.

1 Motivation



Let $\bar\alpha = \bar\alpha_1 \bar\alpha_2 \cdots \bar\alpha_n$ and
$\alpha = \alpha_1 \alpha_2 \cdots \alpha_n$ be n-sequences of positive integers,
$\alpha \le \bar\alpha$ meaning $\alpha_i \le \bar\alpha_i$, $i = 1, 2, \ldots, n$.
Suppose that $f(\alpha)$ is a symmetric function on the variables $\alpha_i$, that
is, the value of $f(\alpha)$ does not change if the coordinates of $\alpha$ are
permuted in an arbitrary way. To store the function $f$, it is enough to allocate
space for the values of $f(\alpha)$ where $\alpha_i \ge \alpha_{i+1}$,
$1 \le i \le n-1$. Thus, we need to enumerate the $\alpha$'s satisfying
$\alpha \le \bar\alpha$ and the boolean function

$$B^n_{sym} = (\alpha_1 \ge \alpha_2) \wedge (\alpha_2 \ge \alpha_3) \wedge \cdots \wedge (\alpha_{n-1} \ge \alpha_n).$$
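As a small illustration of the saving, the following sketch counts by brute force
the non-increasing sequences bounded by $\bar\alpha = (9, 9, 9)$ and compares the
result with the size of the full array. The function name `count_nonincreasing`
is hypothetical and the closed form is the standard multiset count; neither comes
from the paper.

```python
from itertools import product
from math import comb

def count_nonincreasing(bound):
    """Count sequences a <= bound (componentwise, entries >= 1)
    that satisfy a[i] >= a[i+1] for every i."""
    ranges = [range(1, b + 1) for b in bound]
    return sum(
        all(a[i] >= a[i + 1] for i in range(len(a) - 1))
        for a in product(*ranges)
    )

full = 9 ** 3                            # entries of the unrestricted array
stored = count_nonincreasing((9, 9, 9))  # entries that must really be stored
closed_form = comb(9 + 3 - 1, 3)         # multisets of size 3 drawn from 9 values
print(stored, closed_form, full)
```

When all bounds are equal the count is a binomial coefficient, so the stored
fraction shrinks quickly as n grows.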



The motivation for this work is to enumerate and give a perfect hash function
[2,4] for multidimensional arrays which have order restrictions on their entries.
The simplest example of this situation is when the restrictions are given by
$B^n_{sym}$. We show that quite general boolean functions can take the place of
$B^n_{sym}$ and that a large class of enumerative/perfect hash problems can be
put under a common framework.

Fig. 1. The L-piece

To exemplify the appearance of a more complex boolean function, consider the
problem of efficiently enumerating all the L-shaped pieces with vertices which
fit in a $(p \times q)$ integer grid. This is a typical situation treated in [7].
An L-shaped piece is a rectangle $R$ from which we have removed a smaller
rectangle $r \subset R$. Moreover, $R$ and $r$ have a corner in common. By
effecting rotations, translations and reflections we may suppose that our
L-shaped piece has a corner at the origin and that the vertex common to $r$ and
$R$ is the vertex opposite to the origin in rectangle $R$. Positioned in this
way, the L-piece is represented by a quadruple of positive integers
$(X, Y, x, y) = \alpha_1\alpha_2\alpha_3\alpha_4 \le \bar\alpha_1\bar\alpha_2\bar\alpha_3\bar\alpha_4 = (p, q, p, q)$,
as in Figure 1.

The geometry imposes the restrictions: (1) $X \ge x$; (2) $Y \ge y$. Symmetry
considerations enable us to partition the set of $\bar\alpha$-bounded L-pieces
into equivalence classes and to distinguish a set $\mathcal{A}$ of
representatives for these classes. For the occupancy purposes in [7], the
L-pieces $(X, Y, x, y)$ and $(Y, X, y, x)$ must be considered equivalent. This
implies the restrictions: (3) $X \ge Y$ and (4) $X = Y \Rightarrow x \ge y$. In
terms of occupancy, $(X, Y, X, y)$ with $y < Y$, which is a degenerate L, can
(and must) be replaced by the rectangle $(X, Y, X, Y)$. Analogously,
$(X, Y, x, Y)$ with $x < X$ must be replaced by $(X, Y, X, Y)$. In this way, the
equivalence $(X = x) \Leftrightarrow (Y = y)$ holds. The equivalence is rewritten
as two opposite implications in the disguised form: (5)
$(X \ne x) \vee (Y = y)$ and (6) $(Y \ne y) \vee (X = x)$. The restrictions (1)
to (6) are gathered in a boolean expression $B_L$, in terms of the $\alpha_i$'s:

$$B_L = (\alpha_1 \ge \alpha_3) \wedge (\alpha_2 \ge \alpha_4) \wedge (\alpha_1 \ge \alpha_2) \wedge ((\alpha_1 \ne \alpha_2) \vee (\alpha_3 \ge \alpha_4)) \wedge ((\alpha_1 \ne \alpha_3) \vee (\alpha_2 = \alpha_4)) \wedge ((\alpha_2 \ne \alpha_4) \vee (\alpha_1 = \alpha_3)).$$

So, we want to enumerate the 4-sequences $\alpha = \alpha_1\alpha_2\alpha_3\alpha_4$
of positive integers with $\alpha \le \bar\alpha$ and satisfying $B_L$. If, as is
typically needed in packing problems, $\bar\alpha$ is of order
$(120, 100, 120, 100) = (120, 100)^2$, then we have 23,094,225 $\alpha$'s that
satisfy $B_L$ in a total of 144,000,000 possibilities. If
$\bar\alpha = (7, 5)^2$, then there is a total of 190 $\alpha$'s in 1225
possibilities. The valid 190 $\alpha$'s are in 1-1 correspondence with the
st-paths in the digraph of Figure 5.
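The small case can be checked directly. The sketch below enumerates by brute
force every quadruple bounded by $(7, 5, 7, 5)$ and keeps those satisfying the six
clauses of $B_L$. The helper names `satisfies_BL` and `enumerate_phorma` are
illustrative assumptions, not part of the paper.

```python
from itertools import product

def satisfies_BL(a):
    """Evaluate B_L on a = (a1, a2, a3, a4) = (X, Y, x, y)."""
    a1, a2, a3, a4 = a
    return (a1 >= a3 and a2 >= a4 and a1 >= a2
            and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4)
            and (a2 != a4 or a1 == a3))

def enumerate_phorma(bound, predicate):
    """List every a <= bound (componentwise, entries >= 1) satisfying predicate."""
    ranges = [range(1, b + 1) for b in bound]
    return [a for a in product(*ranges) if predicate(a)]

pieces = enumerate_phorma((7, 5, 7, 5), satisfies_BL)
print(len(pieces), 7 * 5 * 7 * 5)  # 190 1225
```

Such a direct scan is only practical for small bounds; the digraph construction
of the later sections avoids visiting the rejected sequences.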



2 The Definition of Phorma and the Objective of the Work 

Let $\mathbb{N}$ be the set of natural numbers, $\mathbb{N}^* = \mathbb{N} \setminus \{0\}$
and $N = \{1, 2, \ldots, n\}$. For $1 \le m \le n$, define $M = \{1, \ldots, m\}$.
Let $Y^X$ be the set of all functions from $X$ into $Y$. Throughout this work,
$\alpha = \alpha_1 \ldots \alpha_n$ is an n-sequence of positive integers, that
is, $\alpha \in (\mathbb{N}^*)^N$. The relation $\rho' \le \rho$ for sequences
$\rho'$ and $\rho$ of equal length means that $\rho'_i \le \rho_i$ for each
$i$-th term of the sequences.

An n-composition $\delta = \delta_1 \ldots \delta_m$ is an element of
$(\mathbb{N}^*)^M$ such that $\sum_{i \le m} \delta_i = n$. The set of
n-compositions is denoted by $C_n$. Given $\alpha$, let $m_\alpha$ be the number
of distinct entries in $\alpha$ and $m_\delta$ be the length of $\delta$. Let
$\hat\alpha = \hat\alpha_1 \ldots \hat\alpha_{m_\alpha} \in C_n$ denote the
n-composition where $\hat\alpha_i$ is the number of occurrences of the $i$-th
smallest entry of $\alpha$.
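The composition attached to a sequence can be computed by counting multiplicities.
A minimal sketch, assuming sequences are given as Python tuples (the name
`composition_of` is hypothetical):

```python
from collections import Counter

def composition_of(alpha):
    """Return the composition of len(alpha) whose i-th part is the number of
    occurrences of the i-th smallest entry of alpha."""
    counts = Counter(alpha)
    return tuple(counts[value] for value in sorted(counts))

print(composition_of((7, 5, 7, 5)))  # (2, 2)
print(composition_of((3, 1, 1, 2)))  # (2, 1, 1)
```

The parts always add up to the length of the sequence, so the result is an
n-composition with $m_\alpha$ parts.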

An n-phorma is a triple $P = (\bar\alpha, B, C)$ satisfying: (i)
$\bar\alpha = \bar\alpha_1 \bar\alpha_2 \cdots \bar\alpha_n \in (\mathbb{N}^*)^N$;
(ii) $B$ is a boolean function whose literals are of type
$(\alpha_i \star \alpha_j)$, where $\alpha \in (\mathbb{N}^*)^N$ and
$\star \in \{<, >, \le, \ge, =, \ne\}$; (iii) $C \subseteq C_n$ is a given set of
n-compositions. The term n-phorma is an acronym for an n-dimensional perfectly
hashable order restricted multidimensional array.

The objective of this paper is to enumerate the set

$$\mathcal{A}(P) = \mathcal{A}(\bar\alpha, B, C) = \{\alpha \mid \alpha \le \bar\alpha,\ \alpha \text{ satisfies } B,\ \hat\alpha \in C\}.$$

In the particular case that $B$ is the empty boolean function, there are no
$B$-restrictions and $\mathcal{A}(\bar\alpha, B, C)$ is the subset of
$(\mathbb{N}^*)^N$ consisting of all sequences $\alpha \le \bar\alpha$ with
$\hat\alpha \in C$. We construct a bijection
$h : \mathcal{A}(P) \to \{0, 1, \ldots, |\mathcal{A}(P)| - 1\}$, so that both
$h$ and $h^{-1}$ are efficiently computable. Such functions are called perfect
hash functions [2,4]. Their usefulness is well known.

As far as we know, the problem of finding perfect hash functions for these quite
general multidimensional arrays has not been considered before in the literature,
whence the lack of more specific references and bibliography. Our solution is
based on the theory of combinatorial families developed in [8]. Here we call
these families NW-families and recall their definition in Section 4. The central
idea is to associate a digraph to a collection of combinatorial objects in such a
way that each object in the family is in 1-1 correspondence with a path in the
digraph. A more detailed account of these combinatorial families appears in [9].

From a phorma $(\bar\alpha, B, C)$ a digraph $G(\bar\alpha, B, C)$ with a single
source $s$ and a single sink $t$ can be constructed so that the elements in
$\mathcal{A}(\bar\alpha, B, C)$ are in 1-1 correspondence with the st-paths.
Indeed, $G(\bar\alpha, B, C)$ is an NW-family [8] encoding
$\mathcal{A}(\bar\alpha, B, C)$ with a simple perfect hash function $h$. We
briefly review these families in Section 4. The digraph
$G((7, 5)^2, B_L, C_4)$ associated to the phorma $((7, 5)^2, B_L, C_4)$ is shown
in Figure 5. In this example, the set $C$ of 4-compositions is the whole set
$C_4$.



3 More Applications of Phormas 

The need to impose order restrictions on arrays appears frequently, and in many
cases it is not difficult to express these restrictions as a phorma. For a larger
example, consider the 7-phormas arising from the generation of T-shaped pieces.
In Figure 2 we show the three kinds of such a piece. They are composed of a
3-block and a 3-D L-piece. In the case of the $T_z$-piece, the L is truncated in
one of its legs along the z-direction. These pieces are the 3D counterpart of the
2D L-shaped piece and they play an important role in 3D packing problems. They
are described by seven parameters, which in the case of the $T_z$-piece are
$(x, X, y, Y, z, Z_m, Z)$. Enumerating the T-pieces contained in a
$(p \times q \times r)$-block was the motivating idea to formalize the notion of
phorma. The need to effect this enumeration appears in [5].

Fig. 2. The T-pieces $T_x$, $T_y$ and $T_z$

As an example, for the $T_z$-piece, the restrictions coming from the geometry and
the symmetry on the seven parameters
$xXyYzZ_mZ = \alpha_1\alpha_2\alpha_3\alpha_4\alpha_5\alpha_6\alpha_7$ are of
three types:

(1) $(X \ge x)$; $(Y \ge y)$; $(Z \ge Z_m \ge z)$;

(2) $(X \ge Y)$; $(X = Y) \Rightarrow (x \ge y)$;

(3) $(x = X) \Rightarrow (z = Z_m)$; $(y = Y) \Rightarrow (x = X) \wedge (z = Z)$.

The first type of restrictions is obvious. The second type expresses the fact
that the $T_z$-piece can be rotated around a vertical axis without modifying its
containment properties. The X- and Y-directions are equivalent. Other axes of
rotation, implying similar restrictions, could be used if the boxes to be packed
into the $T_z$-piece could change their vertical direction. The third type of
restrictions deals with the degenerate cases, in which the $T_z$-piece becomes a
simpler piece. In terms of a phorma type boolean function, the restrictions
translate as a boolean function $B^z_T$ with the following 9 clauses:

$$B^z_T = (\alpha_2 \ge \alpha_1) \wedge (\alpha_4 \ge \alpha_3) \wedge (\alpha_7 \ge \alpha_6) \wedge (\alpha_6 \ge \alpha_5) \wedge (\alpha_2 \ge \alpha_4) \wedge ((\alpha_2 \ne \alpha_4) \vee (\alpha_1 \ge \alpha_3)) \wedge ((\alpha_1 \ne \alpha_2) \vee (\alpha_5 = \alpha_6)) \wedge ((\alpha_3 \ne \alpha_4) \vee (\alpha_1 = \alpha_2)) \wedge ((\alpha_3 \ne \alpha_4) \vee (\alpha_5 = \alpha_7)).$$

In this case, once more, $C$ is the whole set of 7-compositions $C_7$. If, just
to be specific, $\bar\alpha = (15^2\,17^2\,19^3)$, then
$|\mathcal{A}(\bar\alpha, B^z_T, C_7)| = 7{,}510{,}130$, while
$15^2 \cdot 17^2 \cdot 19^3 = 446{,}006{,}475$. The amount of memory required to
store the digraph $G(\bar\alpha, B^z_T, C_7)$ is logarithmically smaller than
$|\mathcal{A}(\bar\alpha, B^z_T, C_7)|$ (see Figure 6) and its construction takes
only a few seconds of computer time. Along the same line, we can derive boolean
functions $B^x_T$ and $B^y_T$ for the other T-pieces $T_x$ and $T_y$ shown in
Figure 2. The three T-pieces are inequivalent under reflections and rotations
which maintain the vertical direction. They play a complementary role in 3-D
packing problems in which the vertical direction of the boxes to be packed must
be preserved.

We briefly mention another application of phorma: finding all the solutions for
Cube It. Let $x < y < z$ be real numbers. Consider the problem of finding all
maximum packings of $(x \times y \times z)$-bricks into a cube of side
$x + y + z$. If $y + z < 3x$, then 27 is an upper bound on the number of bricks
that can be packed, see [3]. There exists a phorma
$(3^{81}, B^{81}_{Cube}, c^3_{27})$ of dimension 81 such that
$\mathcal{A}(3^{81}, B^{81}_{Cube}, c^3_{27})$ has 1008 elements coinciding with
the 1008 distinct solutions for the problem of packing the maximum of 27 boxes.
In this case, $\bar\alpha$ is the sequence of 81 repetitions of 3,
$\bar\alpha = 3333 \ldots 3$, and $C = \{c^3_{27}\}$, where
$c^3_{27} = (27, 27, 27)$. The expression for $B^{81}_{Cube}$ and its
justification are too long to be included in this paper. A higher dimensional
analogue of $B^{81}_{Cube}$ relates to an interesting open problem which is the
subject of ongoing research: how to pack $5^5 = 3125$
$(a \times b \times c \times d \times e)$-boxes into a 5-cube of side
$a + b + c + d + e$. Our implementation (not yet optimized) of the phorma
$(3^{81}, B^{81}_{Cube}, c^3_{27})$ found the 1008 solutions in about a day of
computer time. It is interesting to note that none of these 1008 solutions has a
symmetry. So, their set can be partitioned into 21 classes of 48 elements each,
corresponding to the symmetry group of the cube. Representatives of these 21
classes are given in Figure 3. The brick orientations are given by the
conventions: $a \mapsto yzx$; $A \mapsto zyx$; $b \mapsto xzy$; $B \mapsto zxy$;
$c \mapsto xyz$; $C \mapsto yxz$. The parameters of
$G(3^{81}, B^{81}_{Cube}, c^3_{27})$ are listed in Figure 6. In particular,
$H(3^{81}, B^{81}_{Cube}, c^3_{27})$ has only 4 vertices and the whole difficulty
is to find $\lfloor \mathcal{A}(3^{81}, B^{81}_{Cube}, c^3_{27}) \rfloor$, which
in this case coincides with $\mathcal{A}(3^{81}, B^{81}_{Cube}, c^3_{27})$.



4 NW-Families 



The following concept, introduced in [8], is the central tool for our hashing
scheme. A Nijenhuis-Wilf combinatorial family, or simply an NW-family, is an
acyclic digraph $G$ whose vertex set is denoted by $V(G)$, having the properties
below:

(1) $V(G)$ has a partial order (for $x, y \in V(G)$, $y \preceq x$ if there is a
directed path from $x$ to $y$) with a unique minimal element $t$. For each
$v \in V(G)$ the set $\{x \in V(G) \mid x \preceq v\}$ is finite and includes $t$.

(2) Every vertex $v$, except $t$, has a strictly positive outvalence $\rho(v)$.
For each $v \in V(G)$, the set $E(v)$ of outgoing edges has a $v$-local
rank-label $\ell_v$, $0 \le \ell_v(e) \le \rho(v) - 1$, $e \in E(v)$.

A path starting at $v$ and ending in $t$ is encoded by the sequence of
rank-labels of its edges. Such a path is called an object of order $v$ [8]. The
beauty of this scheme is that we can perform various tasks on the family in an
abstract way, without referring to the actual encoding/decoding of the objects as
paths. An NW-family is especially suited to deal with the following 5 tasks.
Tasks 1 to 4 are from [8]. Task 0 is emphasized here because of its applicability
to the phorma: we need to calibrate the cardinality of
$\mathcal{A}(\bar\alpha, B, C)$ by choosing $\bar\alpha$ in an adequate way.

Fig. 3. All the solutions for Cube It

Task 0: counting: What is the family's cardinality? Algorithm: Given
$v \in V(G)$, let $|v| = \sum \{|head(e)| \mid e \in E(G),\ tail(e) = v\}$, with
$|t| = 1$. From this formula, $|v|$ is easily obtained by recursion. It is
convenient to store it as an attribute of $v \in V(G)$ in a pre-processing phase,
or compilation time.

Task 1: sequencing: Given an object in the family, construct the "next" object.
Algorithm: The next path of a given path $\pi$ in coded form is, in coded form,
the lexicographic successor of $\pi$.

Task 2: ranking (perfect hashing): Given an object $\omega$ in the family, find
the integer $h(\omega)$ such that $\omega$ is the $h(\omega)$-th element in the
order induced by task 1. Algorithm: Let an element-path $\pi$ of order $v$ of an
NW-family, $\pi = (e_1, e_2, \ldots, e_p)$, be given. The rank of $\pi$ is defined
as $h(\pi) = \sum_{i=1}^{p} \chi(e_i)$, where, for an edge $e \in E(v)$,
$\chi(e) = \sum \{|head(f)| \mid f \in E(v),\ \ell_v(f) < \ell_v(e)\}$.

Task 3: unranking: Given an integer $r$, we need to construct the $r$-th path
from $v$ to $t$. Define $pred_v(e)$ as the highest-rank edge of the set
$\{f \in E(v) \mid \ell_v(f) < \ell_v(e)\}$, and let $|head(pred_v(e))| = 0$ if
this set is empty. The required $r$-th path $\pi_r$ is generated as follows.
Algorithm: $\pi_r \leftarrow \emptyset$; $r' \leftarrow 0$; $v' \leftarrow v$;
repeat append to $\pi_r$ the highest-rank edge $e$ of $E(v')$ such that
$r' + |head(pred_{v'}(e))| \le r$; $r' \leftarrow r' + |head(pred_{v'}(e))|$;
$v' \leftarrow head(e)$ until $v' = t$.

Task 4: getting random object: Choose an object uniformly at random from the
given family. Algorithm: Let $\xi \in [0, 1)$ be uniformly chosen at random;
return the $\lfloor |v| \cdot \xi \rfloor$-th object.
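Tasks 0, 2 and 3 can be sketched on a toy NW-family. The digraph `EDGES` below is
an invented example, not one from the paper, and the function names are
hypothetical. The unranking loop uses the cumulative quantity $\chi(e)$ of Task 2
as its threshold; this coincides with $|head(pred_v(e))|$ whenever a vertex has
at most two outgoing edges.

```python
from functools import lru_cache

# A tiny NW-family: each vertex maps to the heads of its outgoing edges,
# listed by increasing local rank-label (list index = label). 't' is the sink.
EDGES = {
    "s": ["u", "v", "w"],
    "u": ["v", "t"],
    "v": ["w", "t"],
    "w": ["t"],
    "t": [],
}

@lru_cache(maxsize=None)
def order(v):
    """Task 0: number of paths from v to t, with |t| = 1."""
    return 1 if v == "t" else sum(order(h) for h in EDGES[v])

def rank(v, labels):
    """Task 2: rank of the path from v coded by its sequence of rank-labels."""
    r = 0
    for lab in labels:
        r += sum(order(h) for h in EDGES[v][:lab])  # chi(e) for the chosen edge
        v = EDGES[v][lab]
    return r

def unrank(v, r):
    """Task 3: label sequence of the r-th path from v to t (0-based)."""
    labels = []
    while v != "t":
        lab, skipped = 0, 0
        # Skip every edge whose whole block of paths has rank below r.
        while skipped + order(EDGES[v][lab]) <= r:
            skipped += order(EDGES[v][lab])
            lab += 1
        labels.append(lab)
        r -= skipped
        v = EDGES[v][lab]
    return labels

print(order("s"), unrank("s", 5), rank("s", [2, 0]))
```

Task 4 then amounts to calling `unrank` on a uniformly chosen integer below
`order(v)`.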



5 Reducing, Sorting, $\bar\alpha$-Roofing: the Digraph $G(\bar\alpha, B, C)$

If $\alpha$ has $m \le n$ distinct entries, let $M_\alpha = \{1, \ldots, m\}$. The
reduction of $\alpha$, denoted by $\lfloor\alpha\rfloor$, is the unique
surjection in $(M_\alpha)^N$ which is order compatible with $\alpha$. That is,
for $i \in N$, if $\alpha_i$ is the $j$-th smallest entry in $\alpha$, then
$\lfloor\alpha\rfloor_i = j$. Let also $\alpha^\uparrow$ denote the m-sequence of
distinct entries of $\alpha$ in ascending order. We call $\alpha^\uparrow$ the
sorting of $\alpha$. Given an ascending m-sequence $\gamma$, let $m_\gamma = m$.

Proposition 1 The n-vector of positive integers $\alpha$ is recoverable from
$(\lfloor\alpha\rfloor, \alpha^\uparrow)$.

Proof: It is sufficient to observe that
$\alpha_i = \alpha^\uparrow_{\lfloor\alpha\rfloor_i}$. ■

Since $\alpha$ induces the pair $(\lfloor\alpha\rfloor, \alpha^\uparrow)$ and, by
Proposition 1, is recoverable from it, we can think of $\alpha$ as the pair
$(\lfloor\alpha\rfloor, \alpha^\uparrow)$ and write
$\alpha = (\lfloor\alpha\rfloor, \alpha^\uparrow)$.
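The decomposition and Proposition 1 can be illustrated in a few lines. This is a
sketch with hypothetical function names (`reduce_seq`, `sort_seq`, `recover`),
assuming sequences are Python tuples:

```python
def reduce_seq(alpha):
    """Reduction: replace each entry by its rank among the distinct entries."""
    rank_of = {v: j for j, v in enumerate(sorted(set(alpha)), start=1)}
    return tuple(rank_of[v] for v in alpha)

def sort_seq(alpha):
    """Sorting: the distinct entries of alpha in ascending order."""
    return tuple(sorted(set(alpha)))

def recover(reduced, sorted_entries):
    """Proposition 1: alpha_i is the entry of the sorting indexed by the
    i-th entry of the reduction (1-based)."""
    return tuple(sorted_entries[j - 1] for j in reduced)

alpha = (7, 3, 5, 3)
print(reduce_seq(alpha), sort_seq(alpha))           # (3, 1, 2, 1) (3, 5, 7)
print(recover(reduce_seq(alpha), sort_seq(alpha)))  # (7, 3, 5, 3)
```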

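For $\alpha \in \mathcal{A}(\bar\alpha, B, C)$ let the $\bar\alpha$-roof of
$\alpha$ be $\lceil\alpha\rceil^{\bar\alpha} = \gamma^*$, where $\gamma^*$ is the
lexicographically maximal increasing m-sequence with the property that
$(\lfloor\alpha\rfloor, \gamma^*) \in \mathcal{A}(\bar\alpha, B, C)$. In
particular, $\alpha^\uparrow_i \le \lceil\alpha\rceil^{\bar\alpha}_i = \gamma^*_i$,
$i = 1, \ldots, m$.

Proposition 2 The $\bar\alpha$-roof of $\alpha$,
$\lceil\alpha\rceil^{\bar\alpha}$, does not depend on $\alpha$ itself but only on
$\lfloor\alpha\rfloor$ and $\bar\alpha$, in the sense that
$\lceil\alpha\rceil^{\bar\alpha} = \lceil\lfloor\alpha\rfloor\rceil^{\bar\alpha}$.

Proof: The $\bar\alpha$-roof
$\lceil\alpha\rceil^{\bar\alpha} = \gamma^*_1 \gamma^*_2 \cdots \gamma^*_m$ can be
constructed as follows. Suppose that, for $1 \le i \le m$, $i$ occurs at
positions $p_{i1}, \ldots, p_{ij_i}$ of $\lfloor\alpha\rfloor$. Then we must have
$\gamma^*_m = \min\{\bar\alpha_{p_{m1}}, \bar\alpha_{p_{m2}}, \ldots, \bar\alpha_{p_{mj_m}}\}$,
due to $\bar\alpha$-dominance. For $i = m-1, m-2, \ldots, 1$, the definition
implies that
$\gamma^*_i = \min\{\bar\alpha_{p_{i1}}, \ldots, \bar\alpha_{p_{ij_i}}, \gamma^*_{i+1} - 1\}$,
by $\bar\alpha$-dominance and to ensure the strict increase of $\gamma^*$. Since
the construction only depends on $\lfloor\alpha\rfloor$ and $\bar\alpha$, the
Proposition is proved. ■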

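Given a phorma $(\bar\alpha, B, C)$ and the corresponding
$\mathcal{A}(\bar\alpha, B, C)$, three sets are defined:
(i) $\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor = \{\lfloor\alpha\rfloor \mid \alpha \in \mathcal{A}(\bar\alpha, B, C)\}$,
(ii) $\mathcal{A}^\uparrow(\bar\alpha, B, C) = \{\alpha^\uparrow \mid \alpha \in \mathcal{A}(\bar\alpha, B, C)\}$,
(iii) $\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha} = \{\lceil\alpha\rceil^{\bar\alpha} \mid \alpha \in \mathcal{A}(\bar\alpha, B, C)\}$.

Usually, but not necessarily (see the phorma
$(3^{81}, B^{81}_{Cube}, c^3_{27})$),
$|\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor|$ is much smaller than
$|\mathcal{A}(\bar\alpha, B, C)|$. By Proposition 2,
$|\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}| \le |\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor|$.
In general this inequality is also not tight. See examples in Figure 6. The
perfect hash function that is constructed for $\mathcal{A}(\bar\alpha, B, C)$
depends on the explicit enumeration of the set
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$. This set, in the case of our
ongoing example, has nine elements,

$$\lfloor\mathcal{A}((7, 5)^2, B_L, C_4)\rfloor = \{1111, 2121, 2211, 3211, 3221, 3321, 4231, 4312, 4321\}.$$

The $\bar\alpha$-roof set has only seven elements because of two duplicates:

$$\lceil\mathcal{A}((7, 5)^2, B_L, C_4)\rceil^{\bar\alpha} = \{5, 57, 45, 457, 457, 345, 4567, 3457, 3457\}.$$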
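The construction in the proof of Proposition 2 translates into a short routine.
The sketch below (the name `roof` is hypothetical) applies it to the nine reduced
sequences listed above with bound $(7, 5, 7, 5)$ and reproduces the listed roofs
in the same order.

```python
def roof(reduced, bound):
    """Roof of a reduced sequence: the largest strictly increasing sequence
    gamma with gamma[reduced[i] - 1] <= bound[i] at every position i."""
    m = max(reduced)
    gamma = [0] * m
    cap = None  # value of gamma*_{i+1} - 1 once the next entry is known
    for value in range(m, 0, -1):
        best = min(bound[p] for p, r in enumerate(reduced) if r == value)
        if cap is not None:
            best = min(best, cap)
        gamma[value - 1] = best
        cap = best - 1
    return tuple(gamma)

REDUCED = ["1111", "2121", "2211", "3211", "3221", "3321", "4231", "4312", "4321"]
BOUND = (7, 5, 7, 5)
roofs = [roof(tuple(map(int, word)), BOUND) for word in REDUCED]
print(roofs)
print(len(set(roofs)))  # 7
```

If the first entry of the returned roof were smaller than 1, no sequence bounded
by the given bound would have that reduction; this does not occur in the example.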

Given a phorma $(\bar\alpha, B, C)$ the digraph $A(\bar\alpha, B, C)$ is defined
as follows. Its vertex set is
$V(A(\bar\alpha, B, C)) = \{s\} \cup \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor \cup \lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}$,
where $s$ is a single source. It is a simple graph, and so each of its directed
edges can be represented by an ordered pair of vertices. For each
$\lfloor\alpha\rfloor \in \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$ there are
an edge $(s, \lfloor\alpha\rfloor)$ and an edge
$(\lfloor\alpha\rfloor, \lceil\alpha\rceil^{\bar\alpha})$. These are all the edges
of $A(\bar\alpha, B, C)$, concluding its definition. The digraph
$A(\bar\alpha, B, C)$ is a subgraph of $G(\bar\alpha, B, C)$. In Figure 5, the
edges of $A((7, 5)^2, B_L, C_4)$ are depicted in dashed gray. The edges of its
complement $H((7, 5)^2, B_L, C_4)$ in $G((7, 5)^2, B_L, C_4)$ (which we define
next) are depicted in solid lines. The number near a vertex $v$ (the first
number, when there are two) is the number of vt-paths in
$G((7, 5)^2, B_L, C_4)$.

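Let $H^\infty$ be the set of all finite strictly increasing sequences of positive
integers. The empty sequence is in $H^\infty$ and is denoted by $t$. Suppose
$\gamma = \gamma_1 \gamma_2 \cdots \gamma_m \in H^\infty$. We define an NW-family
as follows. If $\gamma_m > m$, let $\tilde\gamma$ denote the increasing sequence
of length $m$ satisfying $\tilde\gamma_m = \gamma_m - 1$ and
$\tilde\gamma_i = \min\{\tilde\gamma_{i+1} - 1, \gamma_i\}$, for
$i = m-1, m-2, \ldots, 1$. If $\gamma_m = m$, then $\tilde\gamma$ does not exist.
If $\gamma \ne t$, let ${\downarrow}\gamma$ be the sequence of length $m - 1$
obtained from $\gamma$ by removing its last entry:
${\downarrow}\gamma = \gamma_1 \ldots \gamma_{m-1}$. If $\gamma = t$, then
${\downarrow}\gamma$ does not exist. Given $\gamma', \gamma \in H^\infty$ we say
that $\gamma' \preceq \gamma$ if there is a sequence
$(\gamma = \gamma^1, \gamma^2, \ldots, \gamma^p = \gamma')$, with
$\gamma^i \in H^\infty$, such that, for each $i = 1, 2, \ldots, p-1$, either
$\gamma^{i+1} = \tilde\gamma^i$ or else $\gamma^{i+1} = {\downarrow}\gamma^i$. The
relation $\preceq$ makes $H^\infty$ a partially ordered set, or poset. For
$\gamma \in H^\infty$ let $H_\gamma$ be the acyclic digraph whose vertex set is
$V(H_\gamma) = \{\gamma' \mid \gamma' \preceq \gamma\}$. From each vertex
$\gamma' \in V(H_\gamma)$ there are at most two outgoing edges:
$(\gamma', \tilde\gamma')$, of $\gamma'$-local rank-label 0, if $\tilde\gamma'$
exists, and $(\gamma', {\downarrow}\gamma')$, if ${\downarrow}\gamma'$ exists. The
$\gamma'$-local rank-label of this last edge is either 1, if $\tilde\gamma'$
exists, or 0 otherwise. This concludes the definition of $H_\gamma$.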

Given a path $\pi$ from $\gamma^*$ to $t$ in $H_{\gamma^*}$, a fall of $\pi$ is a
vertex $\gamma$ such that the edge $(\gamma, {\downarrow}\gamma)$ is used by
$\pi$. Path $\pi$ has exactly $m_{\gamma^*}$ falls. In Figure 4 the 4 falls of the
path shown in thick edges are: 5678, 567, 34 and 3. The encoding/decoding of the
increasing sequences $\gamma \le \gamma^*$ as paths in the NW-family
$H_{\gamma^*}$ is particularly simple:

Proposition 3 To a path $\pi$ in $H_{\gamma^*}$ from $\gamma^*$ to $t$
corresponds $\gamma_\pi \le \gamma^*$ consisting of the last coordinates of the
falls of $\pi$ (in reverse order). Reciprocally, to $\gamma \le \gamma^*$
corresponds the unique path $\pi_\gamma$ from $\gamma^*$ to $t$ such that the
last entry of its $i$-th fall coincides with the $(m - i + 1)$-th entry of
$\gamma$. Moreover, $\pi_{\gamma_\pi} = \pi$ and $\gamma_{\pi_\gamma} = \gamma$.

Proof: Straightforward from the definitions. ■

Fig. 4. Path $\pi$ in $H_{5679}$ with falls 5678, 567, 34 and 3 encoding
$\gamma = 3478$, $h_{5679}(\gamma) = 60$

Given a path $\pi$ from $\gamma^*$ to $t$ in $H_{\gamma^*}$, a post-fall of $\pi$
is a vertex $\gamma' = \tilde\gamma$ such that the edge
$(\gamma, {\downarrow}\gamma)$ is used by $\pi$. Path $\pi$ has at most
$m_{\gamma^*}$ post-falls. The set of post-falls of $\pi$ is denoted
$PostFall(\pi)$. In Figure 4, calling $\pi$ the path shown in thick edges, we
have $PostFall(\pi) = \{4567, 456, 23, 2\}$ and its members are depicted as
white vertices. The hash function $h_{\gamma^*}$ in the NW-family $H_{\gamma^*}$
takes a simple form:

Proposition 4 The perfect hash function associated with the NW-family
$H_{\gamma^*}$ is
$h_{\gamma^*}(\gamma) = \sum \{|\gamma'| \mid \gamma' \in PostFall(\pi_\gamma)\}$.

Proof: The result is a specialization of the rank function of a generic
NW-family to $H_{\gamma^*}$. It follows directly from the definitions. ■

From this Proposition it follows in Figure 4 that
$h_{5679}(3478) = 35 + 20 + 3 + 2 = 60$. The terms of the sum correspond to the
orders of the white vertices, forming the set $PostFall(\pi_{3478})$.
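The computation of $h_{5679}(3478)$ can be replayed with a short sketch. The
names `dec`, `order`, `rank_in_H` and `sequences_below` are hypothetical; `dec`
implements the decrement operation and dropping the last tuple entry implements
the removal of the last coordinate. The code assumes its argument `gamma` is
strictly increasing and componentwise below `top`.

```python
from functools import lru_cache
from itertools import combinations

def dec(gamma):
    """Decrement of gamma, or None when the last entry equals the length."""
    m = len(gamma)
    if m == 0 or gamma[-1] == m:
        return None
    out = [0] * m
    out[-1] = gamma[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, gamma[i])
    return tuple(out)

@lru_cache(maxsize=None)
def order(gamma):
    """Number of paths from gamma to the empty sequence t."""
    if not gamma:
        return 1
    d = dec(gamma)
    return (order(d) if d is not None else 0) + order(gamma[:-1])

def rank_in_H(gamma, top):
    """Sum of the orders of the post-falls of the path encoding gamma."""
    v, r = top, 0
    for target in reversed(gamma):
        while v[-1] > target:      # decrement until the last entry matches
            v = dec(v)
        d = dec(v)                 # the post-fall, when it exists
        if d is not None:
            r += order(d)
        v = v[:-1]                 # the fall
    return r

def sequences_below(top):
    """All strictly increasing sequences componentwise <= top."""
    return [g for g in combinations(range(1, top[-1] + 1), len(top))
            if all(a <= b for a, b in zip(g, top))]

print(rank_in_H((3, 4, 7, 8), (5, 6, 7, 9)))  # 60
```

Ranking every sequence returned by `sequences_below((5, 6, 7, 9))` yields each
integer from 0 up to `order((5, 6, 7, 9)) - 1` exactly once.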

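Define
$H(\bar\alpha, B, C) = \bigcup \{H_{\gamma^*} \mid \gamma^* \in \lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}\}$.
Actually, in this union we need only to take maximal $\gamma^*$'s. If
$\gamma' \preceq \gamma^*$, then $H_{\gamma'}$ is a subgraph of $H_{\gamma^*}$ and
it is irrelevant for the union. The digraph $H(7575, B_L, C_4)$ shown in Figure
5 is formed by the union of 4 maximal $\gamma^*$'s:
$H_{3457} \cup H_{4567} \cup H_{457} \cup H_{57}$. In general, the digraph of a
phorma $P = (\bar\alpha, B, C)$ is defined as
$G(\bar\alpha, B, C) = A(\bar\alpha, B, C) \cup H(\bar\alpha, B, C)$.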

In order to make $G(\bar\alpha, B, C)$ an NW-family, we need to define the
$v$-local rank-labels of the $v$-outgoing edges for each vertex $v$ of
$G(\bar\alpha, B, C)$. This can be accomplished by ordering lexicographically the
elements of $\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$ and ranking them in
ascending order: $0, 1, \ldots, |\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor| - 1$.
The edge $(s, \lfloor\alpha\rfloor)$ gets as $s$-local rank the same rank as
$\lfloor\alpha\rfloor$. An edge of type
$(\lfloor\alpha\rfloor, \lceil\alpha\rceil^{\bar\alpha})$ gets
$\lfloor\alpha\rfloor$-local rank 0, because it is the unique
$\lfloor\alpha\rfloor$-outgoing edge. For $\gamma \in V(H(\bar\alpha, B, C))$ we
have already defined the $\gamma$-local rank-labels. With these local ranks the
two conditions of NW-family are satisfied by $G(\bar\alpha, B, C)$. It remains to
verify that its st-paths encode the elements of $\mathcal{A}(\bar\alpha, B, C)$:

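Theorem 1 (Main Theorem) For every phorma $P = (\bar\alpha, B, C)$ the st-paths
of $G(\bar\alpha, B, C)$ are in 1-1 correspondence with the elements of
$\mathcal{A}(\bar\alpha, B, C)$.

Proof: Given an $\alpha \in \mathcal{A}(\bar\alpha, B, C)$, let
$\lceil\alpha\rceil^{\bar\alpha} = \gamma^*$. Define
$\pi_\alpha = (s, \lfloor\alpha\rfloor) \circ (\lfloor\alpha\rfloor, \gamma^*) \circ \pi_{\alpha^\uparrow}$,
where $\pi_{\alpha^\uparrow}$ is the path in $H_{\gamma^*}$ given by Proposition
3. Reciprocally, given an st-path $\pi$ in $G(\bar\alpha, B, C)$, let $\beta$ be
the second vertex of $\pi$, $\gamma^*$ be its third vertex and $\gamma$ be such
that $\pi = (s, \beta) \circ (\beta, \gamma^*) \circ \pi_\gamma$. Define
$\alpha_\pi = (\beta, \gamma)$. These definitions imply
$\pi_{\alpha_\pi} = \pi$ and $\alpha_{\pi_\alpha} = \alpha$. ■

Fig. 5. Digraph $G(7575, B_L, C_4)$ encoding $\mathcal{A}(7575, B_L, C_4)$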



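Given $\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$ ordered lexicographically and
$\beta \in \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$, define
$\|\beta\| = \sum \{|\beta'| \mid \beta' \in \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor,\ \beta' < \beta\}$,
where $|\beta'|$ is the number of paths from $\beta'$ to $t$. In Figure 5 the
values of $\|\beta\|$ appear as the second number near each vertex $\beta$. The
hash function $h$ for a phorma assumes a particularly simple expression:

Proposition 5 Given
$\alpha = (\lfloor\alpha\rfloor, \alpha^\uparrow) \in \mathcal{A}(\bar\alpha, B, C)$,
the perfect hash function $h$ of $G(\bar\alpha, B, C)$ is

$$h(\alpha) = \|\lfloor\alpha\rfloor\| + h_{\lceil\alpha\rceil^{\bar\alpha}}(\alpha^\uparrow).$$

Proof: This value of $h(\alpha)$ follows from the general algorithm for ranking
in an abstract NW-family, when specialized to phormas. ■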
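The formula of Proposition 5 can be exercised end to end on the L-piece phorma
with bound $(7, 5, 7, 5)$. The following is a sketch, not the authors'
implementation: the reduced set is obtained by brute force for brevity, and all
function names are hypothetical.

```python
from functools import lru_cache
from itertools import product

BOUND = (7, 5, 7, 5)

def satisfies_BL(a):
    a1, a2, a3, a4 = a
    return (a1 >= a3 and a2 >= a4 and a1 >= a2 and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4) and (a2 != a4 or a1 == a3))

def reduce_seq(alpha):
    rank_of = {v: j for j, v in enumerate(sorted(set(alpha)), start=1)}
    return tuple(rank_of[v] for v in alpha)

def roof(reduced, bound):
    gamma, cap = [0] * max(reduced), None
    for value in range(max(reduced), 0, -1):
        best = min(bound[p] for p, r in enumerate(reduced) if r == value)
        best = best if cap is None else min(best, cap)
        gamma[value - 1], cap = best, best - 1
    return tuple(gamma)

def dec(gamma):
    m = len(gamma)
    if m == 0 or gamma[-1] == m:
        return None
    out = [0] * m
    out[-1] = gamma[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, gamma[i])
    return tuple(out)

@lru_cache(maxsize=None)
def order(gamma):
    if not gamma:
        return 1
    d = dec(gamma)
    return (order(d) if d is not None else 0) + order(gamma[:-1])

def rank_in_H(gamma, top):
    v, r = top, 0
    for target in reversed(gamma):
        while v[-1] > target:
            v = dec(v)
        d = dec(v)
        if d is not None:
            r += order(d)
        v = v[:-1]
    return r

members = [a for a in product(*(range(1, b + 1) for b in BOUND))
           if satisfies_BL(a)]
reduced_set = sorted({reduce_seq(a) for a in members})  # lexicographic order

offset, total = {}, 0          # offset[beta] plays the role of ||beta||
for beta in reduced_set:
    offset[beta] = total
    total += order(roof(beta, BOUND))

def phorma_hash(alpha):
    """h(alpha) = ||reduction|| + rank of the sorting below the roof."""
    beta = reduce_seq(alpha)
    return offset[beta] + rank_in_H(tuple(sorted(set(alpha))), roof(beta, BOUND))

print(total, phorma_hash((1, 1, 1, 1)))
```

Only the nine reduced sequences and the orders of the vertices of the
H-digraphs need to be kept once the offsets are computed.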
6 Implementation Aspects



The role of the boolean function $B$ in a phorma $(\bar\alpha, B, C)$ is just to
enable the enumeration of $\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$. If the
size of this set is small, then an explicit list of its elements,
$\{\beta^1, \beta^2, \ldots, \beta^u\}$, can be given in place of $B$. If this is
not the case, then a convenient way to input a generic phorma type boolean
function is by means of a tree $T(B)$ with three types of internal nodes:
$\vee$-nodes, $\wedge$-nodes and $\neg$-nodes. The leaves of the tree correspond
to the basic constituent boolean functions of type $\alpha_i \star \alpha_j$,
where $\star \in \{<, >, \le, \ge, =, \ne\}$. The $\neg$-nodes (negation
operator) must have exactly one child. Note that each subtree rooted at an
internal $\circ$-node $v$ ($\circ \in \{\vee, \wedge, \neg\}$) is a boolean tree
obtained by taking the $\circ$-operation of the boolean tree(s) corresponding to
the children of $v$. Given an $\alpha$, it is possible to decide its
$B$-satisfiability by evaluating from the leaves up and arriving at the root of
$T(B)$. See [1] for more details.
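A possible encoding of such a tree, given here only as a sketch, uses nested
tuples: a leaf is `(relation, i, j)` with 1-based indices and an internal node is
`("and", ...)`, `("or", ...)` or `("not", child)`. This representation and the
name `evaluate` are assumptions made for the illustration.

```python
import operator

RELATIONS = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
             ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def evaluate(node, alpha):
    """Evaluate a boolean tree on alpha, from the leaves up."""
    kind = node[0]
    if kind in RELATIONS:                      # leaf: alpha_i (rel) alpha_j
        _, i, j = node
        return RELATIONS[kind](alpha[i - 1], alpha[j - 1])
    if kind == "not":                          # exactly one child
        return not evaluate(node[1], alpha)
    results = (evaluate(child, alpha) for child in node[1:])
    return all(results) if kind == "and" else any(results)

# B_L from Section 1 written as a tree.
B_L = ("and",
       (">=", 1, 3), (">=", 2, 4), (">=", 1, 2),
       ("or", ("!=", 1, 2), (">=", 3, 4)),
       ("or", ("!=", 1, 3), ("==", 2, 4)),
       ("or", ("!=", 2, 4), ("==", 1, 3)))

print(evaluate(B_L, (7, 5, 3, 2)), evaluate(B_L, (7, 5, 7, 4)))  # True False
```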

We also admit two ways of inputting $C$: by means of an explicit list of its
elements, $\{\delta^1, \delta^2, \ldots, \delta^z\}$, if $z = |C|$ is small, or
by a phorma type of boolean restrictions on the coordinates of the $\delta$'s.
In this case, $C$ is itself a boolean expression with clauses of type
$(\delta_i \star \delta_j)$. In the case $C = C_n$, this boolean expression is
empty. We define an NW-family encoding $\bigcup_{n \in \mathbb{N}^*} C_n$:
consider the digraph $L^\infty$, whose vertex set is the set of points in the
plane which have positive integer coordinates. There are at most two edges from
a point $(p, q) \in V(L^\infty)$, namely a west edge $((p, q), (p - 1, q))$, if
$p \ge 2$, and a southwest edge $((p, q), (p - 1, q - 1))$, if $p, q \ge 2$. The
$(p, q)$-local rank-label of the first edge is 0, if it exists, and the
$(p, q)$-local rank-label of the second edge is 1, if both edges exist. In the
case that only the second edge exists, its $(p, q)$-local rank-label is 0. Let
$C_n^m$ be the subset of $C_n$ of n-compositions which have length $m$.

Proposition 6 The paths from $(n, m)$ to $(1, 1)$ in $L^\infty$ are in 1-1
correspondence with the elements of $C_n^m$. Thus $L^\infty$ is an NW-family
encoding the n-compositions for all $n \in \mathbb{N}^*$.

Proof: Let $\delta \in C_n^m$ be given. Construct a path $\pi_\delta$ from
$(n, m)$ to $(1, 1)$ in $L^\infty$ as follows. Let $\delta' \leftarrow \delta$
and $\pi' \leftarrow$ the empty path. Repeat $n - 1$ times: if $\delta'_1 > 1$,
then $\delta'_1 \leftarrow \delta'_1 - 1$ and extend $\pi'$ with a west edge; if
$\delta'_1 = 1$, then $\delta'$ becomes $\delta'$ without its first part and
$\pi'$ is extended with a southwest edge. After the $n - 1$ iterations of this
loop, $\delta'$ is the composition $(1)$ of 1 in 1 part; define
$\pi_\delta = \pi'$. Reciprocally, given a path $\pi$ from $(n, m)$ to $(1, 1)$
in $L^\infty$, construct a $\delta_\pi \in C_n^m$ as follows. Let
$\delta' \leftarrow (1)$. For $i = n - 1, n - 2, \ldots, 1$ do: if the $i$-th
edge of $\pi$ is a southwest edge, let $\delta' \leftarrow (1, \delta')$; if the
$i$-th edge of $\pi$ is a west edge, let $\delta'_1 \leftarrow \delta'_1 + 1$.
Define $\delta_\pi = \delta'$. These definitions imply
$\delta_{\pi_\delta} = \delta$ and $\pi_{\delta_\pi} = \pi$, establishing a 1-1
correspondence between $C_n^m$ and the paths from $(n, m)$ to $(1, 1)$ in
$L^\infty$. ■
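The two constructions in the proof can be sketched as follows, coding a west edge
as `"W"` and a southwest edge as `"SW"`. The function names are hypothetical, and
the decoding reads the edges from last to first so that it inverts the encoding.

```python
def composition_to_path(delta):
    """Encode a composition of n as n - 1 edges, each 'W' or 'SW'."""
    parts = list(delta)
    path = []
    for _ in range(sum(delta) - 1):
        if parts[0] > 1:
            parts[0] -= 1          # shrink the first part: west edge
            path.append("W")
        else:
            parts.pop(0)           # drop a first part equal to 1: southwest edge
            path.append("SW")
    return path

def path_to_composition(path):
    """Decode a path by reading its edges from last to first."""
    parts = [1]
    for edge in reversed(path):
        if edge == "SW":
            parts.insert(0, 1)
        else:
            parts[0] += 1
    return tuple(parts)

def compositions(n):
    """All compositions of n, generated recursively by their first part."""
    if n == 0:
        return [()]
    return [(k,) + rest for k in range(1, n + 1) for rest in compositions(n - k)]

print(composition_to_path((2, 1)))         # ['W', 'SW']
print(path_to_composition(["W", "SW"]))    # (2, 1)
```

A composition with m parts always produces exactly m - 1 southwest edges, which
is why its path starts at height m and ends at height 1.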



By using Proposition 6 it is possible to generate in an efficient way the
$\delta$'s satisfying the boolean expression $C$ via a $C$-restricted implicit
enumeration based on $L^\infty$.

The crucial task to construct (at compile time) the digraph
$G(\bar\alpha, B, C)$ is to explicitly generate
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$. Since $\alpha$ and
$\lfloor\alpha\rfloor$ are order isomorphic, one possibility to produce
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$ is to generate all the $n^n$
members of $N^N$ and to test each such sequence for reducibility and
$B$-satisfiability [2]. This simple-minded approach is suitable for small
dimension $n$. In our 4-dimensional phorma $((7, 5)^2, B_L, C_4)$ there are only
256 tests to be made. When $n$ increases this simple-minded method becomes
inapplicable. For example, for the 7-phorma $(\bar\alpha, B^z_T, C_7)$ there are
$7^7 = 823{,}543$ tests to be made and a better approach is needed to generate
the 1,134 elements of $\lfloor\mathcal{A}(\bar\alpha, B^z_T, C_7)\rfloor$ as well
as the 20 elements of $\lceil\mathcal{A}(\bar\alpha, B^z_T, C_7)\rceil^{\bar\alpha}$
(for $\bar\alpha = (15^2\,17^2\,19^3)$). The basic idea is to implement a
$B$-restricted implicit enumeration scheme which takes into account only reduced
sequences in generating the set $\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$.
This methodology substantially extends the realm of the phorma applicability.
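The simple-minded method is easy to sketch for the 4-dimensional example. The
code below tests the 256 words over {1, 2, 3, 4} for reducibility and for $B_L$;
the names are hypothetical. It assumes that every surviving word is realizable
under the bound $(7, 5, 7, 5)$, which holds here, whereas a general
implementation would also have to check the roof against the bound.

```python
from itertools import product

def is_reduced(word):
    """A word is reduced when its set of entries is exactly {1, ..., m}."""
    return set(word) == set(range(1, max(word) + 1))

def satisfies_BL(a):
    a1, a2, a3, a4 = a
    return (a1 >= a3 and a2 >= a4 and a1 >= a2 and (a1 != a2 or a3 >= a4)
            and (a1 != a3 or a2 == a4) and (a2 != a4 or a1 == a3))

n = 4
candidates = list(product(range(1, n + 1), repeat=n))    # n**n = 256 words
reduced_set = sorted(w for w in candidates
                     if is_reduced(w) and satisfies_BL(w))
print(len(candidates), ["".join(map(str, w)) for w in reduced_set])
```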

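Given a phorma $(\bar\alpha, B, C)$ and $\delta \in C$, define

$$\lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor = \{\beta \in \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor \mid \hat\beta = \delta\}.$$

As we know how to generate $\delta \in C$, the generation of
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$ reduces to the generation of each
$\lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor$, because ($\uplus$ means
disjoint union)

$$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor = \biguplus_{\delta \in C} \lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor.$$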



The m-dimensional grid digraph $J^m$ is the digraph whose vertices are the
points of $\mathbb{R}^m$ with integer coordinates. There is an edge from
$p = p_1 \ldots p_j \ldots p_m$ to $q = q_1 \ldots q_j \ldots q_m$ if
$p_i = q_i$ except for $i = j$, where $p_j = q_j + 1$.

Proposition 7 An element of $\lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor$
corresponds to a path from the point $\delta$ to the origin in $J^{m_\delta}$.

Proof: Given
$\beta = \beta_1 \beta_2 \cdots \beta_n \in \lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor$
we define a path named $\pi_\beta$ in the digraph $J^{m_\delta}$ from $\delta$ to
the origin as follows. Path $\pi_\beta$ starts at $\delta$ and its $i$-th edge is
the edge parallel to the $\beta_i$-th axis. It follows from the definitions that
$\pi_\beta$ finishes at the origin. ■
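The path of Proposition 7 can be traced explicitly. In this sketch (the name
`word_to_grid_path` is hypothetical) the starting point is the composition of
the reduced word, and each entry of the word lowers one coordinate by 1.

```python
def word_to_grid_path(beta):
    """Vertices visited in the grid digraph when reading a reduced word beta."""
    m = max(beta)
    point = [beta.count(j) for j in range(1, m + 1)]  # the composition delta
    visited = [tuple(point)]
    for axis in beta:              # the i-th edge is parallel to axis beta_i
        point[axis - 1] -= 1
        visited.append(tuple(point))
    return visited

print(word_to_grid_path((3, 2, 1, 1)))
# [(2, 1, 1), (2, 1, 0), (2, 0, 0), (1, 0, 0), (0, 0, 0)]
```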



From Proposition 7, a $B$-restricted implicit enumeration scheme based on paths
in $J^m$ produces only reduced words. The construction of
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$,
$\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}$ and, as a consequence,
the construction of the digraph $A(\bar\alpha, B, C)$ are efficiently performed
in this way.

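Now we turn our attention to the construction and storage of the digraph
$H(\bar\alpha, B, C)$. Let
$L(r, m) = \{\gamma \in V(H(\bar\alpha, B, C)) \mid \gamma \in (\mathbb{N}^*)^M,\ \gamma_m = r\}$
and
$\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max} = \{\gamma^* \in \lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha} \mid \gamma^* \text{ maximal}\}$.

Proposition 8
$|L(r, m)| \le |\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}|$.

Proof: For each element $\gamma \in L(r, m)$ choose some
$\gamma^* \in \lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}$ such
that $\gamma \preceq \gamma^*$. This defines a function $f$ from $L(r, m)$ to
$\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}$, given by
$f(\gamma) = \gamma^*$. It is enough to prove that $f$ is injective. Let
$\gamma$ and $\gamma'$ be distinct elements of $L(r, m)$. Note that
${\downarrow}\gamma \ne {\downarrow}\gamma'$. Suppose that $\gamma^*$ and
$\gamma'^*$ are such that $\gamma \preceq \gamma^*$ and
$\gamma' \preceq \gamma'^*$. Then it follows that $\gamma^* \ne \gamma'^*$
because the first $m - 1$ entries of $\gamma$ form ${\downarrow}\gamma$ and the
first $m - 1$ entries of $\gamma'$ form ${\downarrow}\gamma'$. So, $f$ is
injective. ■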

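Let $\bar\alpha^* = \max\{\bar\alpha_i\}$,
$n^* = \max\{m \mid \exists\, \delta \in C \text{ with } m_\delta = m\}$ and
$\nu$ the number of non-empty $L(r, m)$'s.

Proposition 9
$|V(H(\bar\alpha, B, C))| \le 1 + |\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}| \, (\bar\alpha^* - (n^* - 1)/2) \, n^*$.

Proof: Clearly, $\nu \le (\bar\alpha^* - (n^* - 1)/2) \, n^*$. The term 1 is for
the sink $t$. The inequality follows from Proposition 8. ■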

The bound given in Proposition 9 is not tight. In general, the maximum value of
$|L(r, m)|$, $\lambda$, tends to be much smaller than
$|\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}|$. A more
informative parameter related to the size of $H(\bar\alpha, B, C)$ is $\mu$,
defined as

$$\mu = |V(H(\bar\alpha, B, C))| / \nu.$$

For phormas arising in the realm of the applications that we have explored,
$\mu$ is rather small. Given a vertex $\gamma$ of this digraph,
${\downarrow}\gamma$ and $\tilde\gamma$ are easily obtainable. So the edges of
$H(\bar\alpha, B, C)$ do not need to be stored. Each one of the $\nu$
$L(r, m)$'s is kept as a lexicographically ordered list indexed by an
$(\bar\alpha^* \times n^*)$-array. The $(r, m)$-entry of this array is a pointer
to the list $L(r, m)$. A binary search can then be used to locate a specific
member of $L(r, m)$ when computing $h$ and $h^{-1}$.
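The storage scheme can be sketched for the running example, whose four maximal
roofs are 3457, 4567, 457 and 57. The code below is an illustration under
assumptions: a dictionary keyed by (r, m) stands in for the array of pointers,
and the names `vertices_below`, `buckets` and `locate` are hypothetical.

```python
from bisect import bisect_left
from collections import defaultdict

def dec(gamma):
    """Decrement of gamma, or None when the last entry equals the length."""
    m = len(gamma)
    if m == 0 or gamma[-1] == m:
        return None
    out = [0] * m
    out[-1] = gamma[-1] - 1
    for i in range(m - 2, -1, -1):
        out[i] = min(out[i + 1] - 1, gamma[i])
    return tuple(out)

def vertices_below(tops):
    """All non-empty vertices of the union of the H-digraphs of the tops."""
    seen, stack = set(), list(tops)
    while stack:
        g = stack.pop()
        if not g or g in seen:
            continue
        seen.add(g)
        stack.append(g[:-1])       # remove the last entry
        d = dec(g)
        if d is not None:
            stack.append(d)        # decrement
    return seen

TOPS = [(3, 4, 5, 7), (4, 5, 6, 7), (4, 5, 7), (5, 7)]
buckets = defaultdict(list)
for g in sorted(vertices_below(TOPS)):
    buckets[(g[-1], len(g))].append(g)   # key (r, m); each list stays sorted

def locate(g):
    """Binary search for vertex g inside its list L(r, m); None if absent."""
    lst = buckets.get((g[-1], len(g)), [])
    k = bisect_left(lst, g)
    return k if k < len(lst) and lst[k] == g else None

print(len(buckets), max(len(lst) for lst in buckets.values()))
```

In this example the largest list has two members, well below the bound of four
given by Proposition 8.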

The amount of work needed to compute $h_{\gamma^*}(\gamma)$ is basically
proportional to $m_\gamma$, the length of $\gamma$. Indeed, from Proposition 4 we
need only to find the at most $m_\gamma$ elements of the set
$PostFall(\pi_\gamma)$ and add their orders. These orders are stored at the
construction of $H_{\gamma^*}$. This makes the time for computing $h(\alpha)$
independent of $\bar\alpha^*$.

Figure 6 displays basic parameters of various phormas. The following shortcuts
are used: $v_G = |V(G(\bar\alpha, B, C))|$, $v_H = |V(H(\bar\alpha, B, C))|$,
$a_{\bar\alpha} = |\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}|$,
$a_{\bar\alpha}^{max} = |\lceil\mathcal{A}(\bar\alpha, B, C)\rceil^{\bar\alpha}_{max}|$.
The last column of Figure 6 is $10^4 \times d$, with
$d = |\mathcal{A}(\bar\alpha, B, C)| / \prod_{i \in N} \bar\alpha_i$ the density
of $(\bar\alpha, B, C)$. It is interesting to observe how fast the densities of
the symmetric phormas (the ones with $B = B^n_{sym>}$) go to zero as $n$
increases. We present parameters for the phormas $(9^n, B^n_{sym>}, C_n)$,
$2 \le n \le 9$. The boolean functions $B^n_{sym>}$ for these phormas are
obtained from $B^n_{sym}$ by replacing the inequalities $\ge$ by the strict
inequalities $>$. Thus, only strictly decreasing sequences are permitted. Note
that $\mathcal{A}(9^{10}, B^{10}_{sym>}, C_{10}) = \emptyset$.



7 Conclusion 

We have defined a data structure generator which permits the perfect hashing of
order restricted multidimensional arrays $\mathcal{A}(\bar\alpha, B, C)$. The
restrictions admit a general type of boolean functions $B$ formed by order
restricting pairs of entries of the array in arbitrary ways. The boolean function
$B$ is used in forming a reduced set
$\lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$, inducing a partition of
$\mathcal{A}(\bar\alpha, B, C)$. Each
$\lfloor\alpha\rfloor \in \lfloor\mathcal{A}(\bar\alpha, B, C)\rfloor$
corresponds to a member subset of this partition, formed by the $\alpha$'s with
that reduction. The elements of
$\lfloor\mathcal{A}(\bar\alpha, B, C, \delta)\rfloor$, $\delta \in C$, are in 1-1
correspondence with paths from $\delta$ to the origin in the
$m_\delta$-dimensional integer grid digraph $J^{m_\delta}$, and can be
efficiently found in a $B$-restricted implicit enumeration scheme which produces
only reduced sequences. To generate all $\delta \in C$, where $C$ might itself be
given by a boolean function on the $\delta$'s, we use the NW-family $L^\infty$
in a $C$-restricted implicit enumeration search. The whole scheme is summarized
by two facts: (i) an $\alpha \in \mathcal{A}(\bar\alpha, B, C)$ induces three
pieces of information, $\lfloor\alpha\rfloor$, $\alpha^\uparrow$ and
$\lceil\alpha\rceil^{\bar\alpha}$, and is recoverable from the first two,
$\alpha = (\lfloor\alpha\rfloor, \alpha^\uparrow)$; (ii) this decomposition is
reflected in the rank formula for a perfect hashing of
$\mathcal{A}(\bar\alpha, B, C)$:
$h(\alpha) = \|\lfloor\alpha\rfloor\| + h_{\lceil\alpha\rceil^{\bar\alpha}}(\alpha^\uparrow)$.
This encoding scheme has the power of perfectly addressing huge and quite
intricate arrays $\mathcal{A}(\bar\alpha, B, C)$ by means of the logarithmically
smaller NW-family $G(\bar\alpha, B, C)$. This general type of perfect hash scheme
does not seem to have been treated before in the literature. In particular, its
use in database systems is a possible source of relevant applications and
remains to be investigated.

Fig. 6. Parameter values for $G(\bar\alpha, B, C)$ of various phormas